Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Symbolic formatted pretty-printing #615

Merged
merged 7 commits into from Apr 3, 2017

Conversation

rbonichon
Copy link
Contributor

This is based on PR #595. This allows pretty-printing without low-level output: the pretty-printing engine outputs symbolic pretty-printing items in a specific symbolic buffer.

At the end of pretty-printing, the contents of the symbolic buffer can post-processed (and modified) to really output characters on the low-level output device. In particular, it subsumes PR #506 since it gives the user complete control of flushes and low-level output, still preserving the compositional and incremental nature of Format-based pretty-printing .

This is joint work with @pierreweis

box fits on the current line, otherwise every break hint splits the line,
- within an {e compacting} box, a break hint never splits the line, unless
there is no more room on the current line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should stress that the break hints are the one 'directly within' the box not the one inside inner boxes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this is often unclear for many beginners. I proposed to
rephrase as follows:

Each different pretty-printing box kind introduces a specific line splitting
policy:

  • within an {e horizontal} box, break hints never split the line (but the line
    may be split in a box nested deeper),
  • within a {e vertical} box, break hints always split the line,
  • within an {e horizontal/vertical} box, if the box fits on the current line
    then break hints never split the line, otherwise break hint always split the
    line,
  • within an {e compacting} box, a break hint never splits the line,
    unless there is no more room on the current line.

Note that line splitting policy is box specific: the policy of a box does not
rule the policy of inner boxes. For instance, if a vertical box is nested in
an horizontal box, all break hints within the vertical box will split the
line.

-(** {6 Boxes} )
+(
* {6 Pretty-printing boxes} )
+
+(
* The pretty-printing engine uses the concepts of pretty-printing box and

  • break hint to drive the indentation and the line splitting behavior of the
  • pretty-printer.
  • Each different pretty-printing box kind introduces a specific line splitting
  • policy:
    +- within an {e horizontal} box, there is no line splitting,
    +- within a {e vertical} box, every break hint splits the line,
    +- within an {e horizontal/vertical} box there is no line splitting if the
  • box fits on the current line, otherwise every break hint splits the line,
    +- within an {e compacting} box, a break hint never splits the line, unless
  • there is no more room on the current line.

You should stress that the break hints are the one 'directly within' the box not the one inside inner boxes.


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
https://github.com/ocaml/ocaml/pull/615/files/d49e17f82307d7cf396cb6cc27b9fa5ddff0cefd#r66835044

@alainfrisch
Copy link
Contributor

The new field for formatter_out_functions will break any code that creates such a record "from scratch" (i.e. without overriding some fields of an existing record). At least for the future, it would be nice to expose a builder function so that the record can be further extended without breaking client code (using optional arguments for new fields).

@pierreweis
Copy link
Contributor

Alain, yes indeed it is conceivable that someone does build a
formatter_out_functions record from scratch, but this is really not probable,
since the easy way to modify the record is indeed to use a with clause.
In any case, modifying such an hypothetical piece of code will be easy using
a with clause and the resulting code will be even clearer.

Also note that this record has not been extended for decades and the new
field is the last semantic action that the pretty-printer indeed performs
that was not part of the output function record.

I think the best way to achieve a future proof API would be to abstract the
record and only provide functions to modify an existing record. But this
would break even more existing code.

The new field for formatter_out_functions will break any code that creates such a record "from scratch" (i.e. without overriding some fields of an existing record). At least for the future, it would be nice to expose a builder function so that the record can be further extended without breaking client code (using optional arguments for new fields).


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
#615 (comment)

@alainfrisch
Copy link
Contributor

I've checked LexiFi's code base. We have three occurrences where we build a custom formatter_out_functions record. Two of them list explicitly all the fields; one uses the record override form. Actually, this last form would be the trickiest one, since it should semantically be adapted to override out_indent as well but the compiler would not complain. This could lead to subtle pretty-printing bugs. The other two places are indeed trivial to fix, but this makes it impossible to have code compatible with previous versions.

We are generally very conservative with breaking changes in the stdlib (more than I usually wish). It would be worth checking on public OPAM packages to find out the extent of the breakage.

Can you elaborate on how this change is related to the title of the proposal? Without looking at the details, it seems to me that making a distinction between ident and spaces is not strictly related to that, or is it?

I think the best way to achieve a future proof API would be to abstract the record and only provide functions to modify an existing record. But this would break even more existing code.

Well, it seems almost any code that manipulates formatter_out_function would need to be adapted anyway; so perhaps it is best breaking compatibility frankly, if only to prevent silent bugs (as described above).

One could also provide a builder function (easy to extend with optional arguments) and mark the concrete record as being deprecated, without actually dropping it.

@rbonichon
Copy link
Contributor Author

@alainfrisch I have run an over-approximated analysis over all opam packages for 4.03.
Only 3 packages reference the record fields and only 1 constructs it directly (utop). The very same construction appears in OCaml's testsuite and is fixed in #595.

See the details in log_analysis.txt

rbonichon and others added 5 commits April 3, 2017 15:38
- Typos
- English
- Extended documentation for semantic tags
Before this commit, indentation was identical to outputting blanks.
This is generally true but not always: in some applications you may want to use
'\t' as indentation but not for blanks, ...

The `formatter_out_functions` record is extended with an extra settable field
`out_indent`.

This allows the user to control this aspect of pretty-printing.

The teststuite has been update to reflect this modification.
This function is broken.

Before you use this function, you might be in either one of these cases:
1. There are no pending open boxes/tags. In this case, flushing the stack has no
   effect.
2. You still have some open boxes and/or tags. Thus the function call flushes
   them. Now, if you still had some pretty-printing to do afterwards, it makes
   no sense to have called this function.
   Say for the sake of the argument that only 1 box was left open before the
   call:
   - Either the rest (e.g. continuation) of your pretty-printing routine assumed
     that the box was *closed*: now it is broken.
   - Or it *knew* the box was open. The question is now why does it not close it
     itself (or pass this obligation to yet another part of the pretty-printing
     routine) ?

In either case, this function should *not* be used and thus it should not be
exposed to the user. Let us state here that it breaks the invariants upon which
the Format module has been based.
In this function, we repeatedly close the open tags.
This function is now called when resetting the pretty-printing engine.

This is a fix: before this commit, tags were never closed when flushing
the pretty-printing engine. Actually, tags were *only* removed from the
the tag stack.

Detail: opened -> open.
pierreweis and others added 2 commits April 3, 2017 15:44
- Some addition for formatted pretty-printing.
- Some simplification and precision; in particular, explicitely naming the
  standard pretty-printer when we are talking about the standard pretty-printer
@gasche
Copy link
Member

gasche commented Oct 21, 2017

Update: this message is outdated (see Octachron's comment below).

@pierreweis, @rbonichon: this change lacks a Changes entry, so it is not advertised to users of the release -- and they don't know where to look for a reference to this PR if they want to understand the design rationale, etc.

Would it be possible for you to submit a Changes entry (as a new PR), following the CONTRIBUTING.md#changelog guidelines?

At least the following people should be mentioned in the "review by" credits: Gabriel Radanne and Florian Angeletti (that is, Drup and octachron, the first did a review of the patch and the second fixed since-tags and documentation issues after the fact).

Given that we are planning to release by the end of the month, it would be very nice if you could do this (small thing) next week. If you think you will not have the time, please let us know as soon as possible, and @Octachron or myself will write something.

@Octachron
Copy link
Member

@gasche , in fact I had already added a Change entry for the symbolic printer feature
https://github.com/ocaml/ocaml/blob/trunk/Changes#L135 ,
which even contained a short mention of the formatter_out_functions change,
but I had somehow missed the link with #595 and Drup's review.
Sorry for the resulting confusion.

Nevertheless, it would be probably clearer to split the change entry in two: one entry for the symbolic printer and another one fot the out_indent field.

@gasche
Copy link
Member

gasche commented Oct 31, 2017

One of the packages that breaks because of the new record field is notty : see pqwy/notty#17. I'm linking the issue here for reference.

@pqwy
Copy link

pqwy commented Oct 31, 2017

It would not break if not for this.

Having

let display_indent state = state.pp_out_spaces

would enable code to ignore the new feature while retaining the old behavior, which is presently impossible.

That is, the new operation should be defined in terms of the old operation it is partially replacing.

@rbonichon
Copy link
Contributor Author

@pqwy This is exactly the proposal of #1382

There are also drawbacks to keeping the old behavior as you suggest, as I explain in the discussion of said PR. Basically indenting is a separate concept from outputting spaces (think of the tab vs space debate for indentation).

pqwy added a commit to pqwy/notty that referenced this pull request Oct 31, 2017
Adapt to breakage in ocaml/ocaml#615.

Affects `I.strf`. If the background attribute leaks over the leading
space on 4.06, this is it. Full fix requires dropping < 4.06 support.

Fixes #17.
EduardoRFS pushed a commit to esy-ocaml/ocaml that referenced this pull request Jul 23, 2021
stedolan pushed a commit to stedolan/ocaml that referenced this pull request May 24, 2022
* Random reordering of CFG layout

* Add a compiler flag -reorder-blocks-random seed and use it in asmgen
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Co-authored-by: tmattio <tmattio@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants